Interactive Teaching Algorithms for Inverse Reinforcement Learning
We study the problem of inverse reinforcement learning (IRL) with the added
twist that the learner is assisted by a helpful teacher. More formally, we
tackle the following algorithmic question: How could a teacher provide an
informative sequence of demonstrations to an IRL learner to speed up the
learning process? We present an interactive teaching framework where a teacher
adaptively chooses the next demonstration based on the learner's current policy. In
particular, we design teaching algorithms for two concrete settings: an
omniscient setting where a teacher has full knowledge about the learner's
dynamics and a blackbox setting where the teacher has minimal knowledge. Then,
we study a sequential variant of the popular MCE-IRL learner and prove
convergence guarantees of our teaching algorithm in the omniscient setting.
Extensive experiments in a car-driving simulator environment show that learning
can be sped up drastically compared to an uninformative teacher.
Comment: IJCAI'19 paper (extended version).
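As a rough illustration of the omniscient setting, consider the following toy sketch (this is not the paper's algorithm; the chain of five states, the reward values, and the learner's update rule are all made up for illustration). The teacher, who knows the true reward, repeatedly demonstrates the state on which the learner's current estimate is most wrong:

```python
# Toy sketch of the omniscient teaching setting -- NOT the paper's algorithm.
# The teacher knows the true reward over five states; the learner keeps an
# estimate. Each round the teacher demonstrates the state where the learner's
# estimate is currently most wrong, a stand-in for "adaptively chooses the
# next demonstration based on the learner's current policy".

true_reward = [0.0, 0.1, 0.2, 0.5, 1.0]
estimate = [0.0] * 5

def most_informative_state(truth, est):
    """Teacher: pick the state with the largest estimation error."""
    errors = [abs(t - e) for t, e in zip(truth, est)]
    return errors.index(max(errors))

for _ in range(3):
    s = most_informative_state(true_reward, estimate)
    # Learner: a crude update that adopts the demonstrated reward
    # (a real MCE-IRL learner would take a gradient step instead).
    estimate[s] = true_reward[s]
```

After three rounds the three largest errors have been corrected, whereas a teacher demonstrating states in a fixed, uninformative order would still be showing low-reward states.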
Transitions, Losses, and Re-parameterizations: Elements of Prediction Games
This thesis presents some geometric insights into three different
types of two-player prediction games – namely the general learning
task, prediction with expert advice, and online convex
optimization. These games differ in the nature of the opponent
(stochastic, adversarial, or intermediate), the order of the
players' moves, and the utility function. The insights shed light
on the intrinsic barriers of these prediction problems and on the
design of computationally efficient learning algorithms with
strong theoretical guarantees (such as generalizability,
statistical consistency, and constant regret). The main
contributions of the thesis are:
• Leveraging concepts from statistical decision theory, we
develop the toolkit needed to formalize the prediction games
mentioned above and to quantify their objectives.
• We investigate the cost-sensitive classification problem, an
instantiation of the general learning task, and demonstrate its
hardness by deriving lower bounds on its minimax risk.
We then analyse the impact of imposing constraints (such as
corruption level and privacy requirements) on the general
learning task. This naturally leads us to further investigate
strong data processing inequalities, a fundamental concept in
information theory.
Furthermore, by extending the hypothesis-testing interpretation
of standard privacy definitions, we propose an asymmetric
(prioritized) privacy definition.
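To make the cost-sensitive classification problem mentioned in this contribution concrete, here is a minimal sketch (the cost value and posteriors are invented for illustration) of the standard fact that with false-positive cost c and false-negative cost 1 − c, the Bayes-optimal rule predicts the positive class exactly when P(y=1 | x) ≥ c:

```python
# Minimal illustration of cost-sensitive binary classification, the
# instantiation of the general learning task discussed above. The cost
# c and the posteriors eta below are made-up toy values.

def cost_sensitive_risk(eta, predict_one, c):
    """Expected cost at a point with posterior eta = P(y=1 | x)."""
    if predict_one:
        return (1 - eta) * c        # pay c on a false positive
    return eta * (1 - c)            # pay 1 - c on a false negative

def bayes_predict(eta, c):
    """Bayes-optimal rule: predict positive iff the posterior exceeds c."""
    return eta >= c

c = 0.3
for eta in (0.1, 0.3, 0.9):
    best = min(cost_sensitive_risk(eta, p, c) for p in (True, False))
    assert cost_sensitive_risk(eta, bayes_predict(eta, c), c) == best
```

The minimax-risk lower bounds in the thesis quantify how well any learner can approach this Bayes risk from finite data.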
• We study efficient merging schemes for the prediction with
expert advice problem and the geometric properties (mixability
and exp-concavity) of the loss functions that guarantee constant
regret bounds. As a result of our study, we construct two types
of link functions (one via a calculus approach and another via a
geometric approach) that can re-parameterize any binary mixable
loss into an exp-concave loss.
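The constant-regret phenomenon for mixable losses can be checked numerically. Log loss is 1-mixable, and the exponential-weights mixture (the Aggregating Algorithm with learning rate 1) over N experts incurs cumulative log loss at most ln N above the best expert on every outcome sequence. The expert predictions and outcomes below are arbitrary toy data:

```python
import math

# Exponential weights with learning rate 1 for the 1-mixable log loss:
# the mixture's cumulative loss exceeds the best expert's by at most
# ln(number of experts), independent of the sequence length.

experts = [0.9, 0.6, 0.2]          # each expert's constant P(y=1)
outcomes = [1, 1, 0, 1, 1, 0, 1]   # an arbitrary binary sequence

def log_loss(p, y):
    return -math.log(p if y == 1 else 1 - p)

weights = [1.0] * len(experts)
master_loss = 0.0
for y in outcomes:
    total = sum(weights)
    p = sum(w * e for w, e in zip(weights, experts)) / total  # mixture prediction
    master_loss += log_loss(p, y)
    # multiply each weight by the likelihood exp(-loss) of the outcome
    weights = [w * math.exp(-log_loss(e, y)) for w, e in zip(weights, experts)]

best = min(sum(log_loss(e, y) for y in outcomes) for e in experts)
regret = master_loss - best
assert 0.0 <= regret <= math.log(len(experts))  # constant in the horizon
```

For exp-concave losses the same weighted-average prediction works directly; the link functions constructed in the thesis extend this to any binary mixable loss by re-parameterizing it into an exp-concave one.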
• We focus on some recent algorithms for online convex
optimization, which exploit the easy nature of the data (such as
sparsity, predictable sequences, and curved losses) in order to
achieve better regret bounds while remaining protected against
the worst case. We unify some of these existing techniques to
obtain new update rules for the cases when these easy instances
occur together, and analyse their regret bounds.
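One standard mechanism of this "easy data" kind, which the thesis builds on, is AdaGrad-style per-coordinate step sizes for online gradient descent: coordinates that rarely receive gradient mass (sparse data) keep large steps, while frequently hit coordinates are damped, and in the worst case the method still matches plain online gradient descent up to constants. The loss sequence below is synthetic and the step size eta = 1 is an arbitrary choice:

```python
import math

def adagrad_step(x, grad, accum, eta=1.0, eps=1e-8):
    """One diagonal-AdaGrad update; accum holds running squared gradients."""
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    new_x = [xi - eta * g / (math.sqrt(a) + eps)
             for xi, g, a in zip(x, grad, new_accum)]
    return new_x, new_accum

# Minimise f(x) = |x| coordinate-wise with sparse gradients:
# coordinate 0 receives a subgradient every round, coordinate 1 only once,
# so coordinate 1's single step is taken at full size while coordinate 0's
# steps shrink like 1/sqrt(t).
x, accum = [5.0, 5.0], [0.0, 0.0]
for t in range(50):
    grad = [1.0 if x[0] > 0 else -1.0,
            (1.0 if x[1] > 0 else -1.0) if t == 0 else 0.0]
    x, accum = adagrad_step(x, grad, accum)
```

After 50 rounds the dense coordinate has oscillated down close to the minimiser at 0, while the sparse coordinate moved exactly one full-size step, illustrating how the adaptive step sizes track where gradient mass actually arrives.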